Global Trends and Data Storytelling
This report explores global child mortality trends among children aged 5–9, using data from UNICEF. The goal is to identify where the highest death counts occur and to examine their relationship with factors such as national income and geography. Through data storytelling and visualisation, we aim to highlight the countries most in need of targeted interventions. Four ggplot2-style charts, built with plotnine, communicate the key insights: a world map, a bar chart of the top 10 countries by deaths, a time series for selected nations, and a scatterplot of child deaths against Gross National Income (GNI). The report was developed in Python on Google Colab and is designed to inform evidence-based decision-making for global child health initiatives.
!curl -LO https://quarto.org/download/latest/quarto-linux-amd64.deb
!sudo dpkg -i quarto-linux-amd64.deb
!cp /mnt/data/child_mortality_Unicef-11.ipynb .
!quarto render child_mortality_Unicef-11.ipynb --to html
!pip install plotnine geopandas
import pandas as pd
from plotnine import *
import geopandas as gpd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv("/content/unicef_indicator_2.csv")
metadata_df = pd.read_csv("/content/unicef_metadata.csv")
# World country boundaries; the iso3 column is the join key used below
shape_world = gpd.read_file("https://public.opendatasoft.com/api/explore/v2.1/catalog/datasets/world-administrative-boundaries/exports/shp")
# latest_year must exist before the filter below uses it
latest_year = df["time_period"].max()
df_map = df[(df["time_period"] == latest_year) & (df["sex"] == "Total")].copy()
df_map = df_map.rename(columns={"alpha_3_code": "iso3", "obs_value": "Deaths"})
shape_merged = shape_world.merge(df_map, on="iso3", how="left")
(
ggplot(shape_merged)
+ aes(fill="Deaths")
+ geom_map()
+ coord_fixed()
+ scale_fill_gradient(
low="#FFFFCC", high="#FF3300", na_value="lightgrey", name="Deaths"
)
+ labs(
title=f"Child Deaths (Aged 5–9) by Country – {latest_year}",
subtitle="Highest death counts seen across Sub-Saharan Africa and South Asia",
x="Longitude",
y="Latitude"
)
+ theme_bw()
+ theme(
figure_size=(12, 6),
plot_title=element_text(size=14, weight="bold"),
plot_subtitle=element_text(size=11, margin={"b": 10}),
axis_title=element_text(size=10),
legend_title=element_text(size=10),
legend_text=element_text(size=9)
)
)

The world map reveals a striking global disparity in deaths among children aged 5–9. The highest death counts are in Sub-Saharan Africa, where Nigeria, the Democratic Republic of the Congo, and Ethiopia show extremely high totals. These figures reflect systemic issues such as weak healthcare infrastructure, limited access to clean water and vaccinations, and widespread poverty. South Asia, particularly India and Pakistan, also carries a significant burden. Despite progress in economic development, the region continues to face challenges related to rural health services, malnutrition, and inequitable resource distribution. In contrast, most high-income countries, including those in Europe, North America, and East Asia, record very low or negligible death counts in this age group. This contrast underscores the profound global inequality in child health outcomes, which are closely tied to economic status and access to essential services.
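Because the map is built from a left merge on `iso3`, any country whose code fails to match is silently drawn in grey. A minimal sketch of how such mismatches could be surfaced with pandas' merge indicator follows. The two small frames and their values are hypothetical stand-ins for `shape_world` and `df_map`, not the real data.

```python
import pandas as pd

# Hypothetical stand-ins for shape_world and df_map (values are made up).
shapes = pd.DataFrame({"iso3": ["NGA", "IND", "FRA"]})
deaths = pd.DataFrame({"iso3": ["NGA", "IND", "XKX"], "Deaths": [100, 80, 5]})

# An outer merge with indicator=True adds a _merge column that records
# whether each row came from the left frame, the right frame, or both.
check = shapes.merge(deaths, on="iso3", how="outer", indicator=True)

# Shapes with no matching data row: these would appear grey on the map.
no_data = check.loc[check["_merge"] == "left_only", "iso3"].tolist()
# Data rows with no matching shape: these would be missing from the map.
no_shape = check.loc[check["_merge"] == "right_only", "iso3"].tolist()

print("Shapes without data:", no_data)
print("Data without a shape:", no_shape)
```

Running the same check on the real frames would show whether grey areas on the map mean "no reported deaths" or simply "no matching code".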
from plotnine import *
latest_year = df["time_period"].max()
df_top10 = (
df[(df["time_period"] == latest_year) & (df["sex"] == "Total")]
.nlargest(10, "obs_value")
)
(
ggplot(df_top10, aes(x="reorder(country, obs_value)", y="obs_value", fill="country"))
+ geom_col(show_legend=False, width=0.6)
+ coord_flip()
+ labs(
title=f"Top 10 Countries by Child Deaths (Age 5–9) in {latest_year}",
subtitle="Countries with the highest number of deaths among children aged 5–9",
x="Country",
y="Number of Deaths"
)
+ theme_minimal()
+ theme(
figure_size=(10, 6),
axis_title=element_text(size=12, weight="bold"),
axis_text=element_text(size=10),
plot_title=element_text(size=14, weight="bold"),
plot_subtitle=element_text(size=11, margin={"b": 10}),
panel_grid_major=element_line(color="#dddddd"),
panel_grid_minor=element_blank()
)
# Set3 offers 12 colours; Set2 has only 8, fewer than the 10 bars drawn here
+ scale_fill_brewer(type="qual", palette="Set3")
)

The bar chart shows that deaths among children aged 5–9 are concentrated in a handful of countries. Nigeria leads by a wide margin, followed by India, the Democratic Republic of the Congo, and Pakistan. These countries account for a significant share of global child deaths because they combine large child populations with limited access to healthcare, nutrition, and sanitation. Several of the other countries listed, including Ethiopia, Uganda, and Tanzania, face similar economic and infrastructural challenges. The chart emphasizes the urgent need for targeted interventions in these high-burden regions.
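The claim that a few countries account for a significant share of deaths can be put into a single number. The sketch below shows one way to compute it, assuming a frame that holds one row per country and no regional aggregates. The country labels and counts are hypothetical, and the sketch takes the top 2 rather than the top 10 to keep the toy data small.

```python
import pandas as pd

# Hypothetical death counts, used only to illustrate the calculation.
toy = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E"],
    "obs_value": [500, 300, 100, 60, 40],
})

# Take the largest rows, as the report does with nlargest(10, ...).
top = toy.nlargest(2, "obs_value")

# Share of all deaths in the frame that falls in the top group.
share = top["obs_value"].sum() / toy["obs_value"].sum()
print(f"Top {len(top)} countries account for {share:.0%} of deaths")
```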
selected = ["India", "Nigeria", "Pakistan", "Ethiopia"]
df_time = df[(df["country"].isin(selected)) & (df["sex"] == "Total")]
(
ggplot(df_time, aes(x="time_period", y="obs_value", color="country"))
+ geom_line()
+ labs(
title="Child Mortality Trends (Aged 5–9)",
x="Year",
y="Number of Deaths"
)
+ theme_light()
)

The time-series chart shows how child deaths have evolved in four high-burden countries: Ethiopia, India, Nigeria, and Pakistan. India, while starting with the highest number of deaths in the early 1990s, has shown a significant and steady decline over the decades, a pattern consistent with successful national health programs and improved child welfare. Ethiopia also shows a consistent downward trend, which may reflect international aid, better vaccination coverage, and increased healthcare access. In contrast, Nigeria maintains a persistently high level of child deaths with minimal improvement over the years, underscoring systemic health infrastructure challenges. Pakistan shows modest progress, but its trend remains relatively flat, suggesting the need for stronger policy interventions and investment in child health. Together, the lines show which countries are progressing and which are stagnating in efforts to reduce child mortality.
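Descriptions such as "steady decline" or "relatively flat" can be backed by a percentage change between the first and last year for each country. The following is a minimal sketch on hypothetical data. The column names mirror those used in `df_time`, but the countries and values are invented for illustration.

```python
import pandas as pd

# Hypothetical series for two countries (values are made up).
toy = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "B"],
    "time_period": [1990, 2005, 2020, 1990, 2005, 2020],
    "obs_value": [200.0, 150.0, 100.0, 80.0, 78.0, 76.0],
})

# Sort by year so first() and last() pick the earliest and latest values.
ordered = toy.sort_values(["country", "time_period"])
grouped = ordered.groupby("country")["obs_value"]

# Percentage change from the first to the last observed year.
change = (grouped.last() - grouped.first()) / grouped.first() * 100
print(change.round(1))
```

A large negative value corresponds to a clear decline, while a value near zero corresponds to a flat trend.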
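# Latest-year totals, keyed by ISO3 code
df_gni = df[(df["time_period"] == latest_year) & (df["sex"] == "Total")].rename(columns={"alpha_3_code": "iso3"})
# "GNI (current US$)" is total national income, not a per-capita figure
metadata_df = metadata_df.rename(columns={"alpha_3_code": "iso3", "GNI (current US$)": "gni"})
# This join matches on country only: if metadata_df holds several years per
# country, filter it to one year first, or each country is repeated in the plot
merged_df = pd.merge(df_gni, metadata_df, on="iso3", how="left")
df_plot = merged_df[["country_x", "obs_value", "gni"]].dropna()
df_plot = df_plot.rename(columns={"country_x": "country"})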
(
ggplot(df_plot, aes(x="gni", y="obs_value"))
+ geom_point(alpha=0.4, color="#1f77b4")
+ geom_smooth(method="lm", color="darkred", se=True)
+ scale_x_log10(labels=lambda l: [f"${int(v):,}" for v in l])
+ labs(
title="Economic Disparity and Child Mortality (Ages 5–9)",
subtitle="Lower-income countries tend to face higher child death tolls. GNI shown on a log scale.",
x="Gross National Income (Current US$, Log Scale)",
y="Number of Deaths (Children Aged 5–9)"
)
+ theme_minimal()
+ theme(
figure_size=(10, 6),
axis_title=element_text(size=12, weight="bold"),
axis_text=element_text(size=10),
plot_title=element_text(size=14, weight="bold"),
plot_subtitle=element_text(size=11, margin={"b": 10}),
panel_grid_major=element_line(color="#dddddd"),
panel_grid_minor=element_blank()
)
)

This scatterplot visualizes the relationship between Gross National Income (GNI) and the number of deaths among children aged 5–9. Countries with lower GNI tend to record higher death counts, as seen in the dense clustering of high values at the lower end of the GNI scale. The logarithmic axis spreads out the wide income distribution, making disparities more visible. Higher-income countries appear toward the right of the chart with relatively low death figures, while low- and middle-income nations dominate the left side, often with tens of thousands of deaths. The red regression line supports this inverse relationship, reinforcing the idea that child survival is strongly influenced by a nation’s economic capacity and access to healthcare infrastructure.
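The inverse relationship suggested by the regression line can also be summarised with a correlation coefficient. The sketch below computes a Pearson correlation on log10(GNI), which matches the log-scaled x-axis, and a rank-based (Spearman-style) correlation obtained by correlating ranks. The frame is hypothetical and only reuses the `gni` and `obs_value` column names from `df_plot`.

```python
import numpy as np
import pandas as pd

# Hypothetical GNI and death counts (values are made up).
toy = pd.DataFrame({
    "gni": [1e9, 1e10, 1e11, 1e12, 1e13],
    "obs_value": [9000, 7000, 4000, 1500, 200],
})

# Pearson correlation between log10(GNI) and deaths.
pearson_log = np.log10(toy["gni"]).corr(toy["obs_value"])

# Rank correlation: Pearson applied to the ranks of each column.
spearman = toy["gni"].rank().corr(toy["obs_value"].rank())

print(f"Pearson on log10(GNI): {pearson_log:.2f}")
print(f"Rank correlation:      {spearman:.2f}")
```

Values close to -1 would indicate a strong inverse association. A coefficient describes association only and does not by itself establish cause.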
The analysis shows that deaths in the 5–9 age group are heavily concentrated in lower-income regions, particularly Sub-Saharan Africa and South Asia. Countries like Nigeria, India, Pakistan, and the Democratic Republic of the Congo report the highest numbers of child deaths. The scatterplot indicates an inverse relationship between national income and child deaths: the lower the GNI, the higher the death toll. Meanwhile, the time-series plot indicates that while countries like India and Ethiopia have made substantial progress over time, others like Nigeria continue to record high death counts. This underscores the urgent need for increased global investment in healthcare infrastructure, education, and child protection services in low-income nations. These findings support UNICEF’s mission to promote equity in child health outcomes and reduce preventable deaths through targeted, data-driven policy.